Illinois Cross-Lingual Wikifier: Grounding Entities in Many Languages to the English Wikipedia

Authors

  • Chen-Tse Tsai
  • Dan Roth
Abstract

We release a cross-lingual wikification system for all languages in Wikipedia. Given a piece of text in any supported language, the system identifies names of people, locations, and organizations, and grounds these names to the corresponding English Wikipedia entries. The system is based on two components: a cross-lingual named entity recognition (NER) model and a cross-lingual mention grounding model. The cross-lingual NER model is language-independent and can extract named entity mentions from text in any language in Wikipedia. The extracted mentions are then grounded to the English Wikipedia using the cross-lingual mention grounding model. The only resources required to train the proposed system are the multilingual Wikipedia dump and existing training data for English NER. The system is online at http://cogcomp.cs.illinois.edu/page/demo_view/xl_wikifier
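
The two-stage design described in the abstract (mention extraction followed by grounding) can be illustrated with a minimal sketch. This is not the authors' implementation: `Mention`, `toy_ner`, `toy_ground`, `wikify`, and the two lookup tables are hypothetical stand-ins, and the dictionary lookups replace the trained cross-lingual NER and grounding models only to show how the stages connect.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Mention:
    surface: str                  # mention text as it appears in the input
    start: int                    # character offset of the mention
    entity_type: str              # e.g. "PER", "LOC", "ORG"
    title: Optional[str] = None   # English Wikipedia title, None if ungrounded


# Hypothetical toy gazetteer standing in for the cross-lingual NER model.
TOY_GAZETTEER: Dict[str, str] = {"Múnich": "LOC", "Alemania": "LOC"}

# Hypothetical toy lookup standing in for the mention grounding model.
TOY_TITLES: Dict[str, str] = {"Múnich": "Munich", "Alemania": "Germany"}


def toy_ner(text: str) -> List[Mention]:
    """Stage 1 (assumed interface): extract named entity mentions."""
    mentions = []
    for name, entity_type in TOY_GAZETTEER.items():
        pos = text.find(name)
        if pos != -1:
            mentions.append(Mention(name, pos, entity_type))
    return sorted(mentions, key=lambda m: m.start)


def toy_ground(mention: Mention) -> Optional[str]:
    """Stage 2 (assumed interface): map a mention to an English title."""
    return TOY_TITLES.get(mention.surface)


def wikify(
    text: str,
    ner: Callable[[str], List[Mention]] = toy_ner,
    ground: Callable[[Mention], Optional[str]] = toy_ground,
) -> List[Mention]:
    """Run NER, then ground each extracted mention."""
    mentions = ner(text)
    for mention in mentions:
        mention.title = ground(mention)
    return mentions


result = wikify("Múnich es una ciudad de Alemania.")
for m in result:
    print(m.surface, m.entity_type, m.title)
```

Passing the two stages as callables keeps the pipeline independent of how either model is built, so a real NER or grounding component could be swapped in without changing `wikify`.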

Related resources

Learning Better Name Translation for Cross-Lingual Wikification

A notable challenge in cross-lingual wikification is the problem of retrieving English Wikipedia title candidates given a non-English mention, a step that requires translating names written in a foreign language into English. Creating training data for name translation requires a significant amount of human effort. In order to cover as many languages as possible, we propose a probabilistic model...

Cross-lingual Wikification Using Multilingual Embeddings

Cross-lingual Wikification is the task of grounding mentions written in non-English documents to entries in the English Wikipedia. This task involves the problem of comparing textual clues across languages, which requires developing a notion of similarity between text snippets across languages. In this paper, we address this problem by jointly training multilingual embeddings for words and Wiki...

Illinois Cognitive Computation Group UI-CCG TAC 2013 Entity Linking and Slot Filler Validation Systems

In this paper, we describe the University of Illinois (UI CCG) submission to the 2013 TAC KBP English Entity Linking (EL) and Slot Filler Validation (SFV) tasks. We developed two separate systems. Our Entity Linking system integrates an improved version of the Illinois Wikifier with additional functionality to identify and cluster entity mentions that do not correspond to entries in the referen...

Neural Cross-Lingual Entity Linking

A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts. The problem is exacerbated in cross-lingual EL, which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages we need to compu...

Cross-lingual Alignment and Completion of Wikipedia Templates

For many languages, the size of Wikipedia is an order of magnitude smaller than the English Wikipedia. We present a method for cross-lingual alignment of template and infobox attributes in Wikipedia. The alignment is used to add and complete templates and infoboxes in one language with information derived from Wikipedia in another language. We show that alignment between English and Dutch Wikip...

Publication date: 2016